AITopics | local linear convergence

Local Linear Convergence of Forward--Backward under Partial Smoothness

Neural Information Processing SystemsSep-30-2025, 10:32:42 GMT

In this paper, we consider the Forward--Backward proximal splitting algorithm to minimize the sum of two proper closed convex functions, one of which having a Lipschitz continuous gradient and the other being partly smooth relatively to an active manifold $\mathcal{M}$. We propose a generic framework in which we show that the Forward--Backward (i) correctly identifies the active manifold $\mathcal{M}$ in a finite number of iterations, and then (ii) enters a local linear convergence regime that we characterize precisely. This gives a grounded and unified explanation to the typical behaviour that has been observed numerically for many problems encompassed in our framework, including the Lasso, the group Lasso, the fused Lasso and the nuclear norm regularization to name a few. These results may have numerous applications including in signal/image processing processing, sparse recovery and machine learning.

backward, forward, local linear convergence, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.61)
Information Technology > Artificial Intelligence > Machine Learning (0.41)

Add feedback

Local Linear Convergence of Forward--Backward under Partial Smoothness

Jingwei Liang, Jalal Fadili, Gabriel Peyré

Neural Information Processing SystemsFeb-9-2025, 03:01:23 GMT

In this paper, we consider the Forward-Backward proximal splitting algorithm to minimize the sum of two proper closed convex functions, one of which having a Lipschitz continuous gradient and the other being partly smooth relative to an active manifold M. We propose a generic framework under which we show that the Forward-Backward (i) correctly identifies the active manifold M in a finite number of iterations, and then (ii) enters a local linear convergence regime that we characterize precisely. This gives a grounded and unified explanation to the typical behaviour that has been observed numerically for many problems encompassed in our framework, including the Lasso, the group Lasso, the fused Lasso and the nuclear norm regularization to name a few. These results may have numerous applications including in signal/image processing processing, sparse recovery and machine learning.

artificial intelligence, assumption, optimization problem, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

Local Linear Convergence of Forward--Backward under Partial Smoothness

Neural Information Processing SystemsJan-17-2025, 18:42:45 GMT

In this paper, we consider the Forward--Backward proximal splitting algorithm to minimize the sum of two proper closed convex functions, one of which having a Lipschitz continuous gradient and the other being partly smooth relatively to an active manifold \mathcal{M} . We propose a generic framework in which we show that the Forward--Backward (i) correctly identifies the active manifold \mathcal{M} in a finite number of iterations, and then (ii) enters a local linear convergence regime that we characterize precisely. This gives a grounded and unified explanation to the typical behaviour that has been observed numerically for many problems encompassed in our framework, including the Lasso, the group Lasso, the fused Lasso and the nuclear norm regularization to name a few. These results may have numerous applications including in signal/image processing processing, sparse recovery and machine learning.

forward, local linear convergence, partial smoothness, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.66)
Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback

Local Linear Convergence of Forward-Backward under Partial Smoothness

Neural Information Processing SystemsMar-13-2024, 08:30:47 GMT

In this paper, we consider the Forward-Backward proximal splitting algorithm to minimize the sum of two proper closed convex functions, one of which having a Lipschitz continuous gradient and the other being partly smooth relative to an active manifold M. We propose a generic framework under which we show that the Forward-Backward (i) correctly identifies the active manifold M in a finite number of iterations, and then (ii) enters a local linear convergence regime that we characterize precisely. This gives a grounded and unified explanation to the typical behaviour that has been observed numerically for many problems encompassed in our framework, including the Lasso, the group Lasso, the fused Lasso and the nuclear norm regularization to name a few. These results may have numerous applications including in signal/image processing processing, sparse recovery and machine learning.

assumption, smooth function, theorem 3, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

Fast global convergence of gradient descent for low-rank matrix approximation

Chen, Hengchao, Chen, Xin, Elmasri, Mohamad, Sun, Qiang

arXiv.org Machine LearningMay-30-2023

This paper investigates gradient descent for solving low-rank matrix approximation problems. We begin by establishing the local linear convergence of gradient descent for symmetric matrix approximation. Building on this result, we prove the rapid global convergence of gradient descent, particularly when initialized with small random values. Remarkably, we show that even with moderate random initialization, which includes small random initialization as a special case, gradient descent achieves fast global convergence in scenarios where the top eigenvalues are identical. Furthermore, we extend our analysis to address asymmetric matrix approximation problems and investigate the effectiveness of a retraction-free eigenspace computation method. Numerical experiments strongly support our theory. In particular, the retraction-free algorithm outperforms the corresponding Riemannian gradient descent method, resulting in a significant 29\% reduction in runtime.

artificial intelligence, gradient descent, machine learning, (13 more...)

arXiv.org Machine Learning

2305.19206

Country:

North America > Canada > Ontario > Toronto (0.28)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Can We Find Nash Equilibria at a Linear Rate in Markov Games?

Song, Zhuoqing, Lee, Jason D., Yang, Zhuoran

arXiv.org Artificial IntelligenceMar-2-2023

We study decentralized learning in two-player zero-sum discounted Markov games where the goal is to design a policy optimization algorithm for either agent satisfying two properties. First, the player does not need to know the policy of the opponent to update its policy. Second, when both players adopt the algorithm, their joint policy converges to a Nash equilibrium of the game. To this end, we construct a meta algorithm, dubbed as $\texttt{Homotopy-PO}$, which provably finds a Nash equilibrium at a global linear rate. In particular, $\texttt{Homotopy-PO}$ interweaves two base algorithms $\texttt{Local-Fast}$ and $\texttt{Global-Slow}$ via homotopy continuation. $\texttt{Local-Fast}$ is an algorithm that enjoys local linear convergence while $\texttt{Global-Slow}$ is an algorithm that converges globally but at a slower sublinear rate. By switching between these two base algorithms, $\texttt{Global-Slow}$ essentially serves as a ``guide'' which identifies a benign neighborhood where $\texttt{Local-Fast}$ enjoys fast convergence. However, since the exact size of such a neighborhood is unknown, we apply a doubling trick to switch between these two base algorithms. The switching scheme is delicately designed so that the aggregated performance of the algorithm is driven by $\texttt{Local-Fast}$. Furthermore, we prove that $\texttt{Local-Fast}$ and $\texttt{Global-Slow}$ can both be instantiated by variants of optimistic gradient descent/ascent (OGDA) method, which is of independent interest.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2303.03095

Country:

North America > United States (0.14)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Essex > Colchester (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Mathematics of Computing (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.65)
(2 more...)

Add feedback

Model identification and local linear convergence of coordinate descent

Klopfenstein, Quentin, Bertrand, Quentin, Gramfort, Alexandre, Salmon, Joseph, Vaiter, Samuel

arXiv.org Machine LearningOct-22-2020

For composite nonsmooth optimization problems, Forward-Backward algorithm achieves model identification (e.g. support identification for the Lasso) after a finite number of iterations, provided the objective function is regular enough. Results concerning coordinate descent are scarcer and model identification has only been shown for specific estimators, the support-vector machine for instance. In this work, we show that cyclic coordinate descent achieves model identification in finite time for a wide class of functions. In addition, we prove explicit local linear convergence rates for coordinate descent. Extensive experiments on various estimators and on real datasets demonstrate that these rates match well empirical results.

artificial intelligence, convergence, machine learning, (17 more...)

arXiv.org Machine Learning

2010.11825

Country:

Europe > France > Occitanie > Hérault > Montpellier (0.04)
Europe > France > Bourgogne-Franche-Comté > Côte-d'Or > Dijon (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)

Add feedback

Local Linear Convergence of Forward--Backward under Partial Smoothness

Liang, Jingwei, Fadili, Jalal, Peyré, Gabriel

Neural Information Processing SystemsFeb-14-2020, 08:58:56 GMT

In this paper, we consider the Forward--Backward proximal splitting algorithm to minimize the sum of two proper closed convex functions, one of which having a Lipschitz continuous gradient and the other being partly smooth relatively to an active manifold $\mathcal{M}$. We propose a generic framework in which we show that the Forward--Backward (i) correctly identifies the active manifold $\mathcal{M}$ in a finite number of iterations, and then (ii) enters a local linear convergence regime that we characterize precisely. This gives a grounded and unified explanation to the typical behaviour that has been observed numerically for many problems encompassed in our framework, including the Lasso, the group Lasso, the fused Lasso and the nuclear norm regularization to name a few. These results may have numerous applications including in signal/image processing processing, sparse recovery and machine learning. Papers published at the Neural Information Processing Systems Conference.

forward, local linear convergence, partial smoothness, (4 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.72)
Information Technology > Sensing and Signal Processing > Image Processing (0.66)

Add feedback

Calculus of the exponent of Kurdyka-{\L}ojasiewicz inequality and its applications to linear convergence of first-order methods

Li, Guoyin, Pong, Ting Kei

arXiv.org Machine LearningJan-18-2018

In this paper, we study the Kurdyka-{\L}ojasiewicz (KL) exponent, an important quantity for analyzing the convergence rate of first-order methods. Specifically, we develop various calculus rules to deduce the KL exponent of new (possibly nonconvex and nonsmooth) functions formed from functions with known KL exponents. In addition, we show that the well-studied Luo-Tseng error bound together with a mild assumption on the separation of stationary values implies that the KL exponent is $\frac12$. The Luo-Tseng error bound is known to hold for a large class of concrete structured optimization problems, and thus we deduce the KL exponent of a large class of functions whose exponents were previously unknown. Building upon this and the calculus rules, we are then able to show that for many convex or nonconvex optimization models for applications such as sparse recovery, their objective function's KL exponent is $\frac12$. This includes the least squares problem with smoothly clipped absolute deviation (SCAD) regularization or minimax concave penalty (MCP) regularization and the logistic regression problem with $\ell_1$ regularization. Since many existing local convergence rate analysis for first-order methods in the nonconvex scenario relies on the KL exponent, our results enable us to obtain explicit convergence rate for various first-order methods when they are applied to a large variety of practical optimization models. Finally, we further illustrate how our results can be applied to establishing local linear convergence of the proximal gradient algorithm and the inertial proximal algorithm with constant step-sizes for some specific models that arise in sparse recovery.

artificial intelligence, exponent, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1007/s10208-017-9366-8

1602.02915

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)

Add feedback

Local Linear Convergence of Forward--Backward under Partial Smoothness

Liang, Jingwei, Fadili, Jalal, Peyré, Gabriel

Neural Information Processing SystemsDec-31-2014

In this paper, we consider the Forward--Backward proximal splitting algorithm to minimize the sum of two proper closed convex functions, one of which having a Lipschitz continuous gradient and the other being partly smooth relatively to an active manifold $\mathcal{M}$. We propose a generic framework in which we show that the Forward--Backward (i) correctly identifies the active manifold $\mathcal{M}$ in a finite number of iterations, and then (ii) enters a local linear convergence regime that we characterize precisely. This gives a grounded and unified explanation to the typical behaviour that has been observed numerically for many problems encompassed in our framework, including the Lasso, the group Lasso, the fused Lasso and the nuclear norm regularization to name a few. These results may have numerous applications including in signal/image processing processing, sparse recovery and machine learning.

artificial intelligence, assumption, optimization problem, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Filters

Collaborating Authors

local linear convergence

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Local Linear Convergence of Forward--Backward under Partial Smoothness

Local Linear Convergence of Forward--Backward under Partial Smoothness

Local Linear Convergence of Forward--Backward under Partial Smoothness

Local Linear Convergence of Forward-Backward under Partial Smoothness

Fast global convergence of gradient descent for low-rank matrix approximation

Can We Find Nash Equilibria at a Linear Rate in Markov Games?

Model identification and local linear convergence of coordinate descent

Local Linear Convergence of Forward--Backward under Partial Smoothness

Calculus of the exponent of Kurdyka-{\L}ojasiewicz inequality and its applications to linear convergence of first-order methods

Local Linear Convergence of Forward--Backward under Partial Smoothness